Learned Random-Walk Kernels and Empirical-Map Kernels for Protein Sequence Classification

Similar resources

Learned Random-Walk Kernels and Empirical-Map Kernels for Protein Sequence Classification

Biological sequence classification (such as protein remote homology detection) based solely on sequence data is an important problem in computational biology, especially in the current genomics era, when large amounts of sequence data are becoming available. Support vector machines (SVMs) based on mismatch string kernels were previously applied to solve this problem, achieving reasonable success...

Halting in Random Walk Kernels

Random walk kernels measure graph similarity by counting matching walks in two graphs. In their most popular form of geometric random walk kernels, longer walks of length k are downweighted by a factor of λ^k (λ < 1) to ensure convergence of the corresponding geometric series. We know from the field of link prediction that this downweighting often leads to a phenomenon referred to as halting: Lon...
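
As a rough illustration of the geometric random walk kernel described in this abstract, the following minimal sketch counts matching walks through the direct product graph and weights walks of length k by λ^k. It assumes unlabeled graphs given as adjacency matrices (lists of lists) and truncates the series at a fixed length; the function names and the `max_len` parameter are hypothetical and are not taken from the paper. For the untruncated series to converge, λ generally has to be smaller than the inverse of the largest eigenvalue of the product graph's adjacency matrix.

```python
def product_adjacency(a, b):
    """Adjacency matrix of the direct product graph (Kronecker product of a and b)."""
    n, m = len(a), len(b)
    return [[a[i][k] * b[j][l] for k in range(n) for l in range(m)]
            for i in range(n) for j in range(m)]


def geometric_random_walk_kernel(a, b, lam, max_len=50):
    """Truncated geometric series: sum over k of lam**k * (matching walks of length k)."""
    ax = product_adjacency(a, b)
    size = len(ax)
    # walks[v] = number of length-k walks in the product graph starting at node v
    walks = [1.0] * size
    total = 0.0
    for k in range(max_len + 1):
        total += (lam ** k) * sum(walks)
        # Advance from length k to length k + 1 with one matrix-vector product.
        walks = [sum(ax[v][w] * walks[w] for w in range(size)) for v in range(size)]
    return total


# Two single-edge graphs: every product node has exactly one neighbour.
edge = [[0, 1], [1, 0]]
value = geometric_random_walk_kernel(edge, edge, lam=0.5, max_len=2)
```

With a small λ the terms for long walks shrink quickly, so the total is dominated by the shortest walks, which is the effect the abstract calls halting.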

Accuracy of String Kernels for Protein Sequence Classification

Determining protein sequence similarity is an important task for protein classification and homology detection. Typically this may be done using sequence alignment algorithms, yet fast and accurate alignment-free kernel based classifiers exist. Viewing sequences as a “bag of words”, we test a simple weighted string kernel, investigating the effects of k-mer length, sequence length and choice of...
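
The "bag of words" view of a sequence can be sketched as follows. This is a minimal, unweighted k-mer (spectrum) kernel, assuming sequences are plain Python strings; the weighting scheme tested in the paper is not given in this excerpt, so none is applied here, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k):
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s, t, k):
    """Dot product of the two k-mer count vectors."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(count * ct[kmer] for kmer, count in cs.items())


def normalized_spectrum_kernel(s, t, k):
    """Cosine-normalized kernel, which reduces the effect of sequence length."""
    denom = sqrt(spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k))
    return spectrum_kernel(s, t, k) / denom if denom else 0.0


# "ABAB" contains AB twice and BA once; "BABA" contains BA twice and AB once.
shared = spectrum_kernel("ABAB", "BABA", 2)
```

Varying `k` in such a sketch is one way to explore the effect of k-mer length that the abstract mentions.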

Generalized Similarity Kernels for Efficient Sequence Classification

String kernel-based machine learning methods have yielded great success in practical tasks of structured/sequential data analysis. In this paper we propose a novel computational framework that uses general similarity metrics and distance-preserving embeddings with string kernels to improve sequence classification. An embedding step, a distance-preserving bitstring mapping, is used to effectivel...

Efficient multivariate kernels for sequence classification

Kernel-based approaches for sequence classification have been successfully applied to a variety of domains, including text categorization, image classification, speech analysis, biological sequence analysis, time series and music classification, where they show some of the most accurate results. Typical kernel functions for sequences in these domains (e.g., bag-of-words, mismatch, or subseq...

Journal

Journal title: Journal of Computational Biology

Year: 2009

ISSN: 1066-5277,1557-8666

DOI: 10.1089/cmb.2008.0031